Word confusability prediction in automatic speech recognition

Authors

  • Jan Anguita
  • Stéphane Peillon
  • Javier Hernando
  • Alexandre Bramoulle
Abstract

This paper presents a new method for predicting whether two words are likely to be confused by an Automatic Speech Recognition (ASR) system. A new inter-word dissimilarity measure based on Dynamic Time Warping (DTW) is used to classify word pairs as confusable or not confusable. First, the phonetic transcriptions of the two words are aligned using only phonetic information. After the alignment, the accumulated distance is computed with a new inter-phone acoustic distance calculated between the Hidden Markov Models (HMMs) of the phones. Two kinds of alignment are used: with and without insertions and omissions. To evaluate performance, we introduce a classical false acceptance/false rejection framework that compares the a posteriori classification obtained by testing ASR systems with the a priori classification produced by the method. The prediction Equal Error Rate (EER) was measured to be 1.6%, a 50% reduction with respect to the conventional DTW distance.
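The two-step idea in the abstract (align the phone strings first, then accumulate an inter-phone distance along the alignment) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the alignment uses a plain 0/1 edit cost, the `TOY_DIST` table stands in for the HMM-based inter-phone acoustic distance, and `gap_penalty` is a hypothetical cost for inserted or omitted phones. Only the variant that allows insertions and omissions is shown.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[Optional[str], Optional[str]]

def align(p: Sequence[str], q: Sequence[str]) -> List[Pair]:
    """Align two phone strings by minimum edit cost.

    Assumed costs (not from the paper): match 0, substitution 1, gap 1.
    A gap is reported as None on the side where a phone is missing.
    """
    n, m = len(p), len(q)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (0 if p[i - 1] == q[j - 1] else 1)
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Backtrace from the end of both strings to recover the aligned pairs.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (
            0 if p[i - 1] == q[j - 1] else 1
        ):
            pairs.append((p[i - 1], q[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((p[i - 1], None))  # phone of p has no counterpart
            i -= 1
        else:
            pairs.append((None, q[j - 1]))  # phone of q has no counterpart
            j -= 1
    pairs.reverse()
    return pairs

def word_dissimilarity(
    p: Sequence[str],
    q: Sequence[str],
    phone_dist: Callable[[str, str], float],
    gap_penalty: float = 1.0,
) -> float:
    """Sum the inter-phone distance over the aligned pairs."""
    total = 0.0
    for a, b in align(p, q):
        if a is None or b is None:
            total += gap_penalty
        else:
            total += phone_dist(a, b)
    return total

# Hypothetical distance table standing in for an HMM-based acoustic distance.
TOY_DIST = {("b", "k"): 0.8, ("s", "t"): 0.4}

def toy_phone_dist(a: str, b: str) -> float:
    if a == b:
        return 0.0
    return TOY_DIST.get(tuple(sorted((a, b))), 1.0)

if __name__ == "__main__":
    print(word_dissimilarity(["k", "ae", "t"], ["b", "ae", "t"], toy_phone_dist))
    print(word_dissimilarity(["k", "ae", "t"], ["k", "ae", "t", "s"], toy_phone_dist))
```

In such a scheme, a word pair would be flagged as confusable when its dissimilarity falls below a threshold; sweeping that threshold is what yields the false acceptance and false rejection rates from which an EER is read.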


Similar resources

Phonemic variability and confusability in pronunciation modeling for automatic speech recognition


Inter-Phone and Inter-Word Distances for Confusability Prediction in Speech Recognition

In this work we investigate new inter-phone and inter-word distances and we apply them to predict if two words of the lexicon of an Automatic Speech Recognition (ASR) system are likely to be confused. The inter-word distance is calculated from an alignment between the phonetic transcriptions of the words by adding the distances between the aligned phones. We bring a new solution in which the in...


Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition

 In this paper, utilization of clustering algorithms for data fusion in decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...


Predicting word correct rate from acoustic and linguistic confusability

When adapting an existing ASR-application for different user environments, one often gets confronted with speech that does not entirely match the training situation. Differences may stem both from acoustic and linguistic causes. In this paper we explore to what extent the word correct rate (wcr) for a given test set can be predicted from the transcription only (i.e. the linguistic representatio...


Word confusability - measuring hidden Markov model similarity

We address the problem of word confusability in speech recognition by measuring the similarity between Hidden Markov Models (HMMs) using a number of recently developed techniques. The focus is on defining a word confusability that is accurate, in the sense of predicting artificial speech recognition errors, and computationally efficient when applied to speech recognition applications. It is sho...




Publication date: 2004